-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: Version set unions as solvable requirements #56
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!
I think it could be useful to modify the decide
function as well to be aware of multiple version sets in a single requirement. That way you get to conflicts of individual packages faster.
8d2d378
to
68e5ea7
Compare
@baszalmstra Can you elaborate on how this might work? As far as I understand, |
I think you only have to look at the first versionset in a requirement that still has undecided candidates. The idea is to force conflicts as early as possible to reduce backtracking. By only looking at the first versionset instead of the whole requirement you find smaller sets faster is my theory. E.g. foo ==1 | bar >0,<1000 has a large number of candidates but foo==1 only has 1. We want to select foo=1 first to make sure that if there are other conflicting requirements we find them soon. Does that make sense? |
I think that makes sense. Would sorting the candidates list for a version set union such that the version sets with fewer candidates come earlier suffice (in |
Changing the order would also change the outcome of the solution. I can imagine that the order of the union is therefore important. So I think this order should recide in the decide function itself. |
you also cannot know the number of available candidates beforehand because it depends on previous decisions |
a75c683
to
6b248e2
Compare
6b248e2
to
05ac114
Compare
I seemed to have misunderstood your suggestion at first. Having re-read it a few times, I now think I have understood what you are suggesting - to consider the size of only the first version set with available candidates in a requirement in the "least available candidates" heuristic of If my new understanding of your suggestion is correct (and please correct me if I am still wrong), I fail to see how this will reduce the amount of backtracking. In the example you gave of For example, consider a solution which needs two requirements |
Indeed. Maybe instead of theorizing about the implications, we should just benchmark it? I assume you have a large set of cases that you could test this with? |
I don't have test cases on hand, but I can definitely come up with some. How are you currently benchmarking and comparing the performance of various solution strategies? Do you look at the number of solver iterations, the number of times it backtracks, or some other metrics? If it works for you, I think it would be better to merge this PR as is and tackle this question in a separate issue/PR. Also, would you be able to look at the updated C++ bindings? I added support for using |
You can create a snapshot of a couple of cases with your I have a very large snapshot locally of the entire conda ecosystem (using this tool), I then solve for each individual package in that set (20k+) and measure the performance using this tool included in the repo. You can then see a nice distribution of the solve times and compare them against another implementation.
I can take a look at that, but next week at the earliest. Thanks again for this PR, you are doing amazing work! 🚀 |
That's great information, thanks a lot! I should be able to generate real-world use cases with union requirements as snapshots from my APT |
Thanks @eviltak, also for the C++ bindings! This looks good to me! |
This PR introduces the concept of version set unions and allows them to be used to specify solvable requirements that can be fulfilled by more than one version set (belonging to more than one package).
The change is facilitated by the introduction of a
Requirement
type, which is used instead ofVersionSetId
s to specify a required dependency of a solvable. A requirement can either be comprised of a single version set (aVersionSetId
), or multiple version sets (aVersionSetUnionId
). This allows existing code bases to continue using a single version set (VersionSetId
) as a requirement specification without ever having to touch version set unions (VersionSetUnionId
).Breaking changes
This PR introduces breaking changes - the main ones being the following:
Interner::version_sets_in_union
trait methodrequirements
field ofKnownDependencies
fromVec<VersionSetId>
toVec<Requirement>
.requirements
field ofDependencySnapshot
toversion_sets
version_set_unions
field toDependencySnapshot
(only affects those who are constructingDependencySnapshot
instances manually)API migration
Migrating existing depending projects that do not rely on version set unions to the new API is quite straightforward.
version_sets_in_union
in impls ofInterner
. It is perfectly fine for the implementation to beunimplemented!
, since if version set unions are unused this method will never be called.DependencyProvider::get_dependencies
to convert theVersionSetId
s used in therequirements
field ofKnownDependencies
toRequirement
s. SinceRequirement
implementsFrom<VersionSetId>
, this is as simple as callingFrom::from
orInto::into
on eachVersionSetId
.DependencySnapshot
, therequirements
field has been renamed toversion_sets
; all other functionality remains the same.